feat: foundational support for programmatic tool calling #11508

Draft
roomote[bot] wants to merge 1 commit into main from feature/programmatic-tool-calling

@roomote roomote bot commented Feb 17, 2026

Related GitHub Issue

Closes: #11506

Description

This PR attempts to address Issue #11506 by laying the foundational architecture for programmatic tool calling. Feedback and guidance are welcome.

What this PR adds:

  1. Type system extensions:

    • supportsProgrammaticToolCalling flag on ModelInfo schema to detect model capability
    • enableProgrammaticToolCalling toggle on GlobalSettings for user opt-in
    • ApiStreamCodeExecutionChunk stream type for code execution blocks
  2. Model capability detection:

    • Flag set on supported Anthropic models (Claude Sonnet 4.5, Opus 4.x series)
    • Provider-agnostic design: any model can declare support via supportsProgrammaticToolCalling: true
  3. Docker-based sandbox executor (DockerSandboxExecutor):

    • Isolated Python execution in Docker containers with resource limits (CPU, memory, timeout)
    • Read-only root filesystem, no-new-privileges, network disabled by default
    • JSON-based IPC protocol over stdin/stdout for tool call communication
  4. Python tool SDK (ToolBridge):

    • Generates Python functions for supported tools: read_file, write_to_file, execute_command, search_files, list_files
    • Each tool call goes through the IPC bridge back to Roo Code for approval and execution
    • Proper error handling and request ID correlation
  5. Comprehensive tests: 27 tests across 3 test files covering types, ToolBridge, and DockerSandboxExecutor
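As a rough illustration of how the two flags above might combine, here is a minimal sketch. The interfaces are trimmed-down stand-ins for the real `ModelInfo` and `GlobalSettings` schemas, and `shouldUseProgrammaticToolCalling` is a hypothetical helper that is not part of this PR.

```typescript
// Hypothetical, trimmed-down shapes; the real schemas carry many more fields.
interface ModelInfo {
  supportsProgrammaticToolCalling?: boolean
}

interface GlobalSettings {
  enableProgrammaticToolCalling?: boolean
}

// Assumed helper (not in the PR): the feature is active only when the model
// declares support AND the user has opted in. Missing flags count as "off".
function shouldUseProgrammaticToolCalling(model: ModelInfo, settings: GlobalSettings): boolean {
  return model.supportsProgrammaticToolCalling === true && settings.enableProgrammaticToolCalling === true
}
```

Treating an absent flag as `false` keeps the feature strictly opt-in on both sides, which matches the provider-agnostic intent described above.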

Design decisions:

  • Docker was chosen over WASM/Pyodide per the issue discussion: it is familiar to users and already used for MCPs
  • Provider-agnostic architecture: supportsProgrammaticToolCalling on ModelInfo means any provider can opt in
  • Tools require individual approval even within programmatic execution (per issue requirements)
  • Initial tool subset (5 tools) keeps scope focused while covering the most common operations
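The per-tool approval requirement and the request ID correlation could look roughly like the sketch below. The callback names `onToolApproval` and `onToolExecute` come from the executor code under review; the message shapes (`type`, `id`, `tool`, `args`) and the `handleToolCall` function are assumptions made for illustration, not the PR's actual wire format.

```typescript
type ToolName = "read_file" | "write_to_file" | "execute_command" | "search_files" | "list_files"

// Assumed message shapes; the real IPC schema may differ.
interface ToolCallRequest {
  type: "tool_call"
  id: string
  tool: ToolName
  args: Record<string, unknown>
}

interface ToolCallResponse {
  type: "tool_result"
  id: string
  success: boolean
  result?: string
  error?: string
}

interface ToolCallbacks {
  onToolApproval: (req: ToolCallRequest) => Promise<boolean>
  onToolExecute: (req: ToolCallRequest) => Promise<string>
}

// Every request is approved individually, and the response echoes the
// request id so the sandbox side can match it to the pending call.
async function handleToolCall(req: ToolCallRequest, cb: ToolCallbacks): Promise<ToolCallResponse> {
  const approved = await cb.onToolApproval(req)
  if (!approved) {
    return { type: "tool_result", id: req.id, success: false, error: "Tool call denied by user" }
  }
  try {
    const result = await cb.onToolExecute(req)
    return { type: "tool_result", id: req.id, success: true, result }
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err)
    return { type: "tool_result", id: req.id, success: false, error: message }
  }
}
```

Returning a structured failure for a denied or failed call, rather than throwing, lets the Python side raise its own exception while keeping the IPC channel alive.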

Still needed (future PRs):

  • Wire the executor into the task/assistant message processing pipeline
  • Add UI settings toggle in the webview
  • Integrate with Anthropic provider to pass code execution tool type
  • Add webview display for code execution blocks and their results

Test Procedure

Tests were run from the src workspace:

cd src && pnpm vitest run services/programmatic-tool-calling/__tests__/types.spec.ts services/programmatic-tool-calling/__tests__/ToolBridge.spec.ts services/programmatic-tool-calling/__tests__/DockerSandboxExecutor.spec.ts

Result: 3 test files passed, 27 tests passed.

Lint and type checks passed, verified via the pre-push hook (turbo lint + turbo check-types).

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: New and/or updated tests have been added to cover my changes (if applicable).
  • Documentation Impact: Documentation updates will be needed once the feature is fully wired up.
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Documentation Updates

This is foundational architecture that is not yet user-facing. Documentation will be needed once the feature is fully integrated in subsequent PRs.

- Add supportsProgrammaticToolCalling flag to ModelInfo schema
- Add enableProgrammaticToolCalling setting to GlobalSettings
- Set the flag on supported Anthropic models (Claude Sonnet 4.5, Opus 4.x)
- Add ApiStreamCodeExecutionChunk type to the API stream
- Create programmatic-tool-calling service with:
  - DockerSandboxExecutor for isolated Python code execution
  - ToolBridge for generating Python SDK with tool function stubs
  - IPC protocol for tool call communication between sandbox and host
  - Support for read_file, write_to_file, execute_command, search_files, list_files
- Add comprehensive tests (27 passing)

Addresses #11506

roomote bot commented Feb 17, 2026

Reviewed the foundational programmatic tool calling architecture. Found 3 issues in DockerSandboxExecutor, all related to the Docker container lifecycle and IPC handling.

  • Read-only filesystem missing writable tmpfs: --read-only with no --tmpfs /tmp leaves Python with no writable directory, breaking tempfile usage and similar operations.
  • complete message silently discarded: The "complete" case in handleSandboxMessage is a no-op, so the sandbox's structured success/error fields are lost in the normal flow. The close handler falls back to exitCode === 0.
  • Fire-and-forget async tool handling: handleSandboxMessage is not awaited in the stdout data handler. If a timeout fires during an in-flight tool call, the handler writes to a killed process's stdin and the tool result is lost.

Comment on lines +217 to +218
"--read-only",
"--security-opt=no-new-privileges",

--read-only makes the entire root filesystem read-only, and the workspace is mounted :ro. This leaves Python with no writable directory at all. Python itself handles missing __pycache__ gracefully, but any user code (or library) that uses tempfile, writes to /tmp, or needs a writable home directory will fail with a read-only filesystem error. The standard pattern for read-only Docker containers is to add --tmpfs /tmp so there's a writable scratch space.

Suggested change
- "--read-only",
- "--security-opt=no-new-privileges",
+ "--read-only",
+ "--tmpfs", "/tmp",
+ "--security-opt=no-new-privileges",
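To show where the suggested `--tmpfs` flag sits among the other hardening options, here is a minimal sketch of an argument builder. The `SandboxLimits` shape and `buildDockerArgs` are hypothetical and not the PR's actual code. The Docker flags themselves are standard `docker run` options. The container command is left out.

```typescript
// Hypothetical configuration shape for illustration only.
interface SandboxLimits {
  image: string
  memoryMb: number
  cpus: number
  networkEnabled: boolean
  workspacePath: string
}

function buildDockerArgs(limits: SandboxLimits): string[] {
  return [
    "run",
    "--rm",
    "-i", // keep stdin open for the JSON IPC channel
    "--read-only", // root filesystem is immutable
    "--tmpfs", "/tmp", // writable in-memory scratch space for tempfile and similar
    "--security-opt=no-new-privileges",
    `--memory=${limits.memoryMb}m`,
    `--cpus=${limits.cpus}`,
    ...(limits.networkEnabled ? [] : ["--network=none"]),
    "-v", `${limits.workspacePath}:/workspace:ro`, // workspace stays read-only
    limits.image,
  ]
}
```

With this layout only `/tmp` is writable, so the workspace and root filesystem stay protected while Python code that relies on temporary files can still run.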

Fix it with Roo Code or mention @roomote and request a fix.

Comment on lines +299 to +301
case "complete":
  // Code execution completed - handled in the close event
  break

The "complete" case in the switch is a no-op break, so the success and error fields from the Python script's completion message are silently discarded. The close handler only picks them up if the complete message happens to be the last incomplete fragment in stdoutBuffer (i.e., not terminated by \n). In the normal case, the Python SDK writes the complete message with a trailing newline, so it arrives as a full line, gets parsed here, and is thrown away. The close handler then falls through to exitCode === 0 and never sees the structured error or traceback from the sandbox.

Consider storing the parsed complete message (e.g., in a local variable) so the close handler can use its success/error fields instead of relying on the exit code.
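One way to act on this suggestion is sketched below. `SandboxOutputTracker`, `SandboxOutcome`, and the exact fields of the completion message are assumptions for illustration. The idea is simply to remember the last parsed `complete` message and consult it in the close handler, before falling back to the exit code.

```typescript
// Assumed shape of the sandbox's completion message.
interface CompleteMessage {
  type: "complete"
  success: boolean
  error?: string
}

interface SandboxOutcome {
  success: boolean
  error?: string
}

function isComplete(msg: unknown): msg is CompleteMessage {
  if (typeof msg !== "object" || msg === null) return false
  const m = msg as Record<string, unknown>
  return m.type === "complete" && typeof m.success === "boolean"
}

class SandboxOutputTracker {
  private buffer = ""
  private completion: CompleteMessage | undefined

  // Feed a raw stdout chunk; every newline-terminated line is parsed.
  push(chunk: string): void {
    this.buffer += chunk
    const lines = this.buffer.split("\n")
    this.buffer = lines.pop() ?? "" // keep the trailing partial line
    for (const line of lines) this.handleLine(line)
  }

  private handleLine(line: string): void {
    if (!line.trim()) return
    let msg: unknown
    try {
      msg = JSON.parse(line)
    } catch {
      return // not JSON; ignored in this sketch
    }
    if (isComplete(msg)) this.completion = msg // stored instead of discarded
  }

  // Called from the process "close" handler.
  finish(exitCode: number | null): SandboxOutcome {
    this.handleLine(this.buffer) // a final line may lack its newline
    this.buffer = ""
    if (this.completion) {
      const { success, error } = this.completion
      return error === undefined ? { success } : { success, error }
    }
    return { success: exitCode === 0 } // fallback when no message arrived
  }
}
```

Because the stored message takes precedence, a Python traceback reported in `error` survives even when the process itself exits with code 0.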

Comment on lines +129 to +136
for (const line of lines) {
  if (!line.trim()) {
    continue
  }
  this.handleSandboxMessage(line, proc, toolCalls).catch((err) => {
    stderr += `IPC error: ${err.message}\n`
  })
}

handleSandboxMessage is async (it awaits onToolApproval and onToolExecute) but is called fire-and-forget here with only a .catch(). If the timeout fires while a tool approval/execution is in-flight, the promise resolves and proc is killed, but handleSandboxMessage continues running and will attempt proc.stdin!.write() on a dead process. The .catch() swallows the error, but the tool call result being pushed to toolCalls at that point is lost since the outer promise already resolved. Consider tracking in-flight tool requests so the timeout handler can wait for or abort them cleanly, or at minimum guard the stdin.write against a closed/killed process.
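Both mitigations mentioned here can be sketched in a few lines. `InFlightTracker`, `safeWrite`, and the `WritableLike` interface are hypothetical names. `WritableLike` mirrors only the `destroyed` and `write` members of a Node.js writable stream, and the `killed` argument stands in for the child process's `killed` property.

```typescript
// Minimal structural stand-in for proc.stdin (an assumption for this sketch).
interface WritableLike {
  destroyed: boolean
  write(data: string): boolean
}

// Tracks handler promises so a timeout path can wait for them to settle.
class InFlightTracker {
  private pending = new Set<Promise<void>>()

  track(work: Promise<void>, onError: (err: unknown) => void): void {
    const tracked: Promise<void> = work
      .catch(onError)
      .then(() => {
        this.pending.delete(tracked)
      })
    this.pending.add(tracked)
  }

  count(): number {
    return this.pending.size
  }

  // Resolves once every tracked handler has settled.
  async drain(): Promise<void> {
    while (this.pending.size > 0) {
      await Promise.all(Array.from(this.pending))
    }
  }
}

// Writes a JSON line only if the process and its stdin are still usable.
function safeWrite(stdin: WritableLike | null, killed: boolean, payload: object): boolean {
  if (killed || !stdin || stdin.destroyed) return false
  stdin.write(JSON.stringify(payload) + "\n")
  return true
}
```

In the data handler, each `handleSandboxMessage(...)` promise would be passed to `track`, and the timeout path could await `drain()` (ideally with its own upper bound) before resolving, so that late tool results are either recorded or explicitly dropped.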

Development

Successfully merging this pull request may close these issues.

[ENHANCEMENT] Programmatic tool calling
